Skip-gram model


Distributed Representations of Words and Phrases and their Compositionality

Neural Information Processing Systems

The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several improvements that make the Skip-gram model more expressive and enable it to learn higher quality vectors more rapidly. We show that by subsampling frequent words we obtain significant speedup, and also learn higher quality representations as measured by our tasks. We also introduce Negative Sampling, a simplified variant of Noise Contrastive Estimation (NCE) that learns more accurate vectors for frequent words compared to the hierarchical softmax. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases.
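The subsampling of frequent words mentioned in the abstract can be sketched in a few lines. The keep-probability sqrt(t/f(w)) below follows the paper's discard rule P(w) = 1 - sqrt(t/f(w)); the function names and the default threshold t = 1e-5 are illustrative choices, not an official implementation.

```python
import math
import random

def keep_prob(freq, t=1e-5):
    """Probability of keeping one occurrence of a word with corpus
    frequency `freq`: sqrt(t / f(w)), clipped to 1 for rare words."""
    return min(1.0, math.sqrt(t / freq))

def subsample(tokens, counts, total, t=1e-5, seed=0):
    """Stochastically drop occurrences of frequent tokens; rare tokens
    (freq <= t) are always kept."""
    rng = random.Random(seed)
    return [w for w in tokens
            if rng.random() < keep_prob(counts[w] / total, t)]
```

Discarding most occurrences of very frequent words (e.g. "the") both speeds up training and, as the abstract notes, improves the learned representations of the remaining words.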


AdvSGM: Differentially Private Graph Learning via Adversarial Skip-gram Model

Zhang, Sen, Ye, Qingqing, Hu, Haibo, Xu, Jianliang

arXiv.org Artificial Intelligence

The skip-gram model (SGM), which employs a neural network to generate node vectors, serves as the basis for numerous popular graph embedding techniques. However, since the training datasets contain sensitive linkage information, the parameters of a released SGM may encode private information and pose significant privacy risks. Differential privacy (DP) is a rigorous standard for protecting individual privacy in data analysis. Nevertheless, applying differential privacy to skip-gram on graphs is highly challenging due to the complex link relationships, which potentially result in high sensitivity and necessitate substantial noise injection. To tackle this challenge, we present AdvSGM, a differentially private skip-gram for graphs via adversarial training. Our core idea is to leverage adversarial training to privatize skip-gram while improving its utility. Towards this end, we develop a novel adversarial training module by devising two optimizable noise terms that correspond to the parameters of a skip-gram. By fine-tuning the weights between modules within AdvSGM, we can achieve differentially private gradient updates without additional noise injection. Extensive experimental results on six real-world graph datasets show that AdvSGM preserves high data utility across different downstream tasks.

Graph embedding, which has attracted increasing research attention, represents nodes by low-dimensional vectors while preserving the inherent properties and structures of the graph. In this way, well-studied machine learning algorithms can be easily applied to further mining tasks like clustering, classification, and prediction. Skip-gram models (SGMs) are a popular class of graph embedding models thanks to their simplicity and effectiveness, including DeepWalk [1], LINE [2], and node2vec [3]. However, SGMs, which capture not only general data characteristics but also specific details about individual data, are vulnerable to adversarial attacks, particularly user-linkage attacks [4] that exploit the linkage information between nodes to infer whether an individual is present in the training dataset. Therefore, node embeddings need to be sanitized with privacy guarantees before they can be released to the public. Differential privacy (DP) [5] is a well-studied statistical privacy model recognized for its rigorous mathematical underpinnings. In this paper, we study the problem of achieving privacy-preserving skip-gram for graphs under differential privacy.
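For readers unfamiliar with how skip-gram models are applied to graphs, a rough DeepWalk-style sketch is shown below: uniform random walks over the graph are treated as "sentences", and a skip-gram model is then trained on them with nodes playing the role of words. All names and defaults here are illustrative; this is not AdvSGM's training procedure.

```python
import random

def random_walks(adj, num_walks=2, walk_len=5, seed=42):
    """DeepWalk-style uniform random walks over an adjacency dict.
    Each walk is a node sequence that a skip-gram model can treat
    as a sentence of word tokens."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in adj:
            walk = [start]
            while len(walk) < walk_len:
                nbrs = adj[walk[-1]]
                if not nbrs:           # dead end: stop this walk early
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks
```

The resulting walks are exactly the structure that leaks linkage information: every consecutive pair in a walk is an edge of the graph, which is why the paper argues embeddings must be sanitized before release.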


Learning Complex Word Embeddings in Classical and Quantum Spaces

Harvey, Carys, Clark, Stephen, Brown, Douglas, Meichanetzidis, Konstantinos

arXiv.org Artificial Intelligence

We present a variety of methods for training complex-valued word embeddings, based on the classical Skip-gram model, with a straightforward adaptation simply replacing the real-valued vectors with arbitrary vectors of complex numbers. In a more "physically-inspired" approach, the vectors are produced by parameterised quantum circuits (PQCs), which are unitary transformations resulting in normalised vectors which have a probabilistic interpretation. We develop a complex-valued version of the highly optimised C code version of Skip-gram, which allows us to easily produce complex embeddings trained on a 3.8B-word corpus for a vocabulary size of over 400k, for which we are then able to train a separate PQC for each word. We evaluate the complex embeddings on a set of standard similarity and relatedness datasets, for some models obtaining results competitive with the classical baseline. We find that, while training the PQCs directly tends to harm performance, the quantum word embeddings from the two-stage process perform as well as the classical Skip-gram embeddings with comparable numbers of parameters. This enables a highly scalable route to learning embeddings in complex spaces which scales with the size of the vocabulary rather than the size of the training corpus. In summary, we demonstrate how to produce a large set of high-quality word embeddings for use in complex-valued and quantum-inspired NLP models, and for exploring potential advantage in quantum NLP models.
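The abstract does not spell out how a (word, context) pair is scored once the vectors are complex-valued; one natural choice, sketched below with made-up names, is the real part of the Hermitian inner product, which reduces to the ordinary dot product when the imaginary parts are zero.

```python
def hermitian_score(u, v):
    """Real part of the Hermitian inner product
    <u, v> = sum_i u_i * conj(v_i), for complex vectors u and v.
    With purely real vectors this is the standard skip-gram dot product."""
    return sum((a * b.conjugate()).real for a, b in zip(u, v))
```

For example, `hermitian_score(v, v)` is the squared modulus of `v`, so the score of a vector with itself is always real and non-negative, just as in the real-valued model.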


Distributed Representations of Words and Phrases and their Compositionality

Neural Information Processing Systems

The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
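The phrase-finding method mentioned at the end of the abstract scores each bigram as (count(wi wj) - δ) / (count(wi) × count(wj)) and merges bigrams whose score exceeds a threshold into single tokens such as "Air_Canada". The formula follows the paper; the helper names and default values below are illustrative.

```python
from collections import Counter

def phrase_score(bigram_count, count_a, count_b, delta=5):
    """Bigram score from the paper; delta discounts very rare bigrams
    so that infrequent co-occurrences are not promoted to phrases."""
    return (bigram_count - delta) / (count_a * count_b)

def find_phrases(tokens, threshold=1e-4, delta=5):
    """Return the set of bigrams (joined with '_') whose score exceeds
    the threshold; the paper applies this pass repeatedly to build
    longer phrases."""
    uni = Counter(tokens)
    bi = Counter(zip(tokens, tokens[1:]))
    return {f"{a}_{b}" for (a, b), c in bi.items()
            if phrase_score(c, uni[a], uni[b], delta) > threshold}
```

A bigram that co-occurs often relative to its parts ("air canada") passes the threshold, while bigrams seen only once are suppressed by the δ discount.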


An FPGA-Based Accelerator for Graph Embedding using Sequential Training Algorithm

Sunaga, Kazuki, Sugiura, Keisuke, Matsutani, Hiroki

arXiv.org Artificial Intelligence

A graph embedding is an emerging approach that can represent a graph structure with a fixed-length low-dimensional vector. node2vec is a well-known algorithm to obtain such a graph embedding by sampling neighboring nodes on a given graph with a random walk technique. However, the original node2vec algorithm typically relies on a batch training of graph structures; thus, it is not suited for applications in which the graph structure changes after the deployment. In this paper, we focus on node2vec applications for IoT (Internet of Things) environments. To handle changes of graph structures after the IoT devices have been deployed in edge environments, we propose to combine an online sequential training algorithm with node2vec. The proposed sequentially-trainable model is implemented on a resource-limited FPGA (Field-Programmable Gate Array) device to demonstrate the benefits of our approach. The proposed FPGA implementation achieves up to 205.25 times speedup compared to the original model on CPU. Evaluation results using dynamic graphs show that, while the original model loses accuracy when the graph structure changes, the proposed sequential model obtains better graph embeddings that can even increase accuracy under such changes.
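The random walk technique node2vec uses is a second-order biased walk: the return parameter p and in-out parameter q reweight each candidate next node depending on its relation to the previously visited node. A minimal sketch of one such step, with illustrative names and defaults (not the paper's or the accelerator's code), is:

```python
import random

def node2vec_step(adj, prev, cur, p=1.0, q=1.0, rng=None):
    """One biased node2vec step from `cur`, having arrived from `prev`:
    weight 1/p for returning to `prev`, 1 for neighbors of `prev`,
    and 1/q for moving farther away."""
    rng = rng or random.Random(0)
    nbrs = adj[cur]
    weights = []
    for x in nbrs:
        if x == prev:
            weights.append(1.0 / p)      # return to the previous node
        elif x in adj[prev]:
            weights.append(1.0)          # stay close (BFS-like)
        else:
            weights.append(1.0 / q)      # explore outward (DFS-like)
    return rng.choices(nbrs, weights=weights, k=1)[0]
```

Large p discourages immediate backtracking and large q keeps the walk local, which is why tuning (p, q) lets node2vec interpolate between BFS-like and DFS-like neighborhood sampling.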


#NeurIPS2023 outstanding papers

AIHub

The thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023) is underway in New Orleans. At the official opening session of the conference on Monday evening, the outstanding papers were announced. The awards comprised two outstanding main track paper awards, two outstanding main track runner-ups, two outstanding datasets and benchmark track papers, and the annual test of time award. Abstract: We propose a scheme for auditing differentially private machine learning systems with a single training run. This exploits the parallelism of being able to add or remove multiple training examples independently.


Research Papers for NLP Beginners - KDnuggets

#artificialintelligence

If you're new to the world of data and have a particular interest in NLP (Natural Language Processing), you're probably looking for resources to help you gain a better understanding. You have probably come across many different research papers and are sitting there confused about which one to choose. Because let's face it, they're not short and they do consume a lot of brain power. So it would be smart to choose the right one to benefit your path to mastering NLP. I have done some research and collected a few NLP research papers that come highly recommended for newbies to the NLP area and overall NLP knowledge.


All you need to know about the word2vec vectorisation

#artificialintelligence

Converting any form of data into numerical data has long been a central problem. In NLP alone there has been extensive research into vectorising natural language into numerical data. In this complete NLP blog series, let's do a deep dive into word2vec models and their variants. In 2013, Google introduced the word2vec model, which took the NLP world by storm. Unlike the TF-IDF and bag-of-words models, the most important aspect of word2vec was that, for the first time, it vectorised a word while taking its semantic context into account.
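The practical payoff of semantic vectors is that similar words end up close together, which is usually measured with cosine similarity. A minimal sketch (the vectors you would compare come from a trained word2vec model; the function itself is generic):

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense word vectors:
    dot(u, v) / (|u| * |v|), ranging from -1 to 1."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)
```

Under TF-IDF or bag-of-words, "king" and "queen" share no dimensions and score near zero; with word2vec embeddings their cosine similarity is high because the training objective places words with similar contexts near each other.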


Skip-Gram Model

#artificialintelligence

Natural Language Processing is a popular field of Artificial Intelligence. We process human language, as text or speech, to make computers behave more like humans. Humans have produced a huge amount of data written in a careless, unstructured format, which makes it hard for any machine to find meaning in raw text. We therefore need to transform this data into a vector format so that a machine can learn from the raw text.
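The first step from raw text towards such vectors in the skip-gram model is generating (center, context) training pairs: for each word, every other word within a fixed window becomes a context target. A minimal sketch (names and the default window size are illustrative):

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) pairs for skip-gram training:
    each token is paired with every neighbor within `window`
    positions on either side."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs
```

The model is then trained to predict the context word from the center word for each pair, which is what forces words appearing in similar contexts towards similar vectors.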